Delay Asymptotics with Retransmissions and
Incremental Redundancy Codes over Erasure Channels

Yang Yang^, Jian Tan*, Ness B. Shroff^, Hesham El Gamal^
^Department of Electrical and Computer Engineering 
The Ohio State University, Columbus 43210, OH 
*IBM T.J. Watson Research, Hawthorne 10532, NY 



Abstract — Recent studies have shown that retransmis- 
sions can cause heavy-tailed transmission delays even when 
packet sizes are light-tailed. Moreover, the impact of heavy-tailed
delays persists even when packet sizes are upper
bounded. The key question we study in this paper is
how the use of coding techniques to transmit information, 
together with different system configurations, would affect 
the distribution of delay. To investigate this problem, we 
model the underlying channel as a Markov modulated binary 
erasure channel, where transmitted bits are either received 
successfully or erased. Erasure codes are used to encode 
information prior to transmission, which ensures that a fixed 
fraction of the bits in the codeword can lead to successful 
decoding. We use incremental redundancy codes, where the 
codeword is divided into codeword trunks and these trunks 
are transmitted one at a time to provide incremental redun- 
dancies to the receiver until the information is recovered. 
We characterize the distribution of delay under two different 
scenarios: (I) Decoder uses memory to cache all previously 
successfully received bits. (II) Decoder does not use memory, 
where received bits are discarded if the corresponding 
information cannot be decoded. In both cases, we consider 
codeword length with infinite and finite support. From a 
theoretical perspective, our results provide a benchmark to 
quantify the tradeoff between system complexity and the 
distribution of delay. 

I. Introduction 

Retransmission is the basic component used in most 
medium access control protocols and it is used to ensure 
reliable transfer of data over communication channels 
with failures [1]. Recent studies [?] have revealed
the surprising result that retransmission-based protocols
could cause heavy-tailed transmission delays even if the
packet length is light-tailed, resulting in very long
delays and possibly zero throughput. Moreover, [?] shows
that even when the packet sizes are upper bounded, 
the distribution of delay, although eventually light-tailed, 
may still have a heavy-tailed main body, and that the 
heavy-tailed main body could dominate even for relatively 
small values of the maximum packet size. In this paper 
we investigate the use of coding techniques to transmit 
information in order to alleviate the impact of heavy tails, 
and substantially reduce the incurred transmission delay. 



In our analysis, we focus on the Binary Erasure Channel. 
Erasures in communication systems can arise in different 
layers. At the physical layer, if the received signal falls 
outside acceptable bounds, it is declared as an erasure. At 
the data link layer, some packets may be dropped because 
of checksum errors. At the network layer, packets that 
traverse through the network may be dropped because 
of buffer overflow at intermediate nodes and therefore 
never reach the destination. All these errors can result in 
erasures in the received bit stream. 

In order to investigate how different coding techniques 
would affect the delay distribution, we use a general 
coding framework called incremental redundancy codes. In 
this framework, each codeword is split into several pieces 
with equal size, which are called codeword trunks. The 
sender sends only one codeword trunk at a time. If the 
receiver cannot decode the information, it will request 
the sender to send another piece of the codeword trunk. 
Therefore, at every transmission, the receiver gains extra 
information, which is called incremental redundancy. 

In order to combat channel erasures, we use erasure 
codes as channel coding to encode the information. Era- 
sure codes represent a group of coding schemes which 
ensure that even when some portions of the codeword 
are lost, it is still possible for the receiver to recover 
the corresponding information. Roughly speaking, the 
encoder transforms a data packet of $l$ symbols into a
longer codeword of $l_c$ symbols, where the ratio $\beta = l/l_c$
is called the code-rate. An erasure code is said to be near
optimal if it requires slightly more than $l$ symbols, say
$(1 + \epsilon)l$ symbols, to recover the information, where $\epsilon$ can
be made arbitrarily small at the cost of increased encoding
and decoding complexity. Many elegant low-complexity
erasure codes have been designed for erasure channels,
e.g., Tornado codes [?], LT codes [?], and Raptor codes
[9]. For the sake of simplicity, throughout the paper, we
assume $\epsilon = 0$. In other words, any $\beta$ fraction of the
codeword can recover the corresponding information, and
a lower $\beta$ indicates a larger redundancy in the codeword.

Fig. 1. Decoder that does not use memory scenario

We specify different scenarios in this paper. In the
first scenario, as shown in Fig. 1, the entire codeword
is transmitted as a unit, and received bits are simply
discarded if the corresponding information cannot be 
recovered. Note that in this scenario, the decoder memory 
is not exploited for caching received bits across different 
transmissions. This scenario occurs because the receiver 
may not have the requisite computation/storage power 
to keep track of all the erasure positions and the bits 
that have been previously received, especially when the 
receiver is responsible for handling a large number of 
flows simultaneously. In the second scenario, we assume 
that the receiver has enough memory space and compu- 
tational power to accumulate received bits from different 
(re)transmissions, which enables the use of incremental 
redundancy codes, where a codeword of length $l_c$ is
split into r codeword trunks with equal size, and these 
codeword trunks are transmitted one at a time. At the 
receiver, all successfully received bits from every trans- 
mission are buffered at the receiver memory according 
to their positions in the codeword. If the receiver cannot 
decode the corresponding information, it will request the 
sender to send another piece of codeword trunk. At the 
sender, these codeword trunks are transmitted in a round-robin
manner. We call these two scenarios Decoder that
does not use memory and Decoder that uses memory,
respectively.

Given the above two different types of decoder, there 
are two more factors that can affect the distribution of 
delay. (I) Channel Dynamics: In order to capture the time
correlation nature of the wireless channels, we assume
that the channel is Markov modulated. More specifically,
we assume a time-slotted system where one bit can
be transmitted per time slot, and the current channel state
distribution depends on channel states in the previous
$k$ time slots. When $k = 0$, it corresponds to the i.i.d.
channel model. (II) Codeword length distribution: We
assume throughout the paper that the codeword length
is light-tailed, which implies that the system
works in a benign environment. We consider two different
codeword length distributions, namely, codeword length
with infinite support and codeword length with finite
support, respectively. For the former, the codeword length
distribution has an exponentially decaying tail with decay
rate $\lambda$; for the latter, the codeword length has an upper
bound $b$.

Contribution 

The main contribution of this work is the following: 

• When decoder memory is not exploited, the tail of the 
delay distribution depends on the code rate. Specif- 
ically, we show that when the coding rate is above 
a certain threshold, the delay distribution is heavy 
tailed, otherwise it is light tailed. This shows that 
substantial gains in delay can be achieved over the 
standard retransmission case (repetition coding) by 
adding a certain amount of redundancy in the codeword.
As mentioned earlier, prior work has shown
that repetition coding results in heavy-tailed delays
even when the packet sizes are light-tailed.

• When decoder memory is exploited, the tail of the 
delay distribution is always light-tailed. This implies 
that the use of receiver memory results in a further 
substantial reduction in the transmission delay. 

• The aforementioned results are for the case when 
the codeword size can have infinite support. We 
also characterize the transmission delay for each of 
the above cases when the codeword size has finite 
support (zero-tailed), and show similar tradeoffs be- 
tween the coding rate and use of receiver memory 
in terms of the main body of the delay distribution 
(rather than the eventual tail). 

The remainder of this paper is structured as follows: In
Section II we describe the system model. In Section III
we consider the scenario where the decoder memory is
exploited. Then, in Section IV we investigate the situation
where the decoder does not use memory. Finally, in
Section V we provide numerical studies to verify our main
results.

II. System Model 

The channel dynamics are modeled as a slotted system
where one bit can be transmitted per slot. Furthermore,
we assume that the slotted channel is characterized by
a binary stochastic process $\{X_n\}_{n \ge 1}$, where $X_n = 1$
corresponds to the situation when the bit transmitted at
time slot $n$ is successfully received, and $X_n = 0$ when the
bit is erased (called an erasure).

Since, in practice, the channel dynamics are often
temporally correlated, we investigate the situation in
which the current channel state distribution depends on
the channel states in the preceding $k$ time slots. More
precisely, for $\mathcal{F}_n = \{X_i\}_{i \le n}$ and fixed $k$, we define
$\mathcal{H}_n = \{X_n, \ldots, X_{n-k+1}\}$ for $n \ge k \ge 1$ with $\mathcal{H}_n = \{\emptyset, \Omega\}$
for $k = 0$, and assume that $P[X_n = 1 \mid \mathcal{F}_{n-1}] = P[X_n = 1 \mid \mathcal{H}_{n-1}]$
for all $n \ge k$. To put it another way, the augmented
state $Y_n = [X_n, \ldots, X_{n-k+1}]$, $n \ge k$, forms a Markov



chain. Let $\Pi$ denote the transition matrix of the Markov
chain, with $\pi(s, u)$ being the one-step transition probability from
state $s$ to state $u$. Throughout this paper, we assume that
$\Pi$ is irreducible and aperiodic, which ensures that this
Markov chain is ergodic [10]. Therefore, for any initial
value $\mathcal{H}_k$, the parameter $\gamma$ is well defined and given by

$$\gamma = \lim_{n \to \infty} P[X_n = 1],$$

and, from the ergodic theorem (see Theorem 1.10.2 in [10]),

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} X_i = \gamma \quad \text{almost surely},$$

which means the long-term fraction of the bits that can
be successfully received is equal to $\gamma$. Therefore, we call
$\gamma$ the channel capacity.

In the degenerate case when $k = 0$, we have a
memoryless binary erasure channel (i.i.d. binary erasure
channel). Correspondingly, $\mathcal{H}_n = \{\emptyset, \Omega\}$ and $\Pi = [\gamma]$.

As mentioned in the introduction, we study two different
scenarios in this paper, namely decoder that uses
memory and decoder that does not use memory. In the first
scenario, the sender splits a codeword into $r$ codeword
trunks with equal size and transmits them one at a time
in a round-robin manner, while the receiver uses memory
to cache all previously successfully received bits according
to their positions in the codeword. In the second scenario,
the receiver discards any successfully received bits if they
cannot recover the corresponding information, and the
sender transmits the entire codeword as a unit.

We let $L_c$ denote the number of bits in the codeword
with infinite support, and assume that there exist $\lambda > 0$
and $z > 0$ such that

$$\lim_{x \to \infty} \frac{\log P[x < L_c < x + z]}{x} = -\lambda. \qquad (1)$$

We let $L_c(b)$ denote the number of bits in the codeword
with finite support, with $b$ being the maximum codeword
length, and let $P[L_c(b) > x] = P[L_c > x \mid L_c \le b]$ for
any $x > 0$. We focus on erasure codes, where a fixed
fraction $\beta$ ($0 < \beta < 1$) of bits in the codeword can lead to
a successful decoding. We call this fraction $\beta$ the code-rate.

Formal definitions of the number of retransmissions and
the delays are given as follows:

Definition 1 (Decoder that uses memory). $N_m^{(r)}$ denotes
the total number of transmissions for a codeword with
variable length $L_c$ and number of codeword trunks $r$ when
the decoder uses memory.
The transmission delay is defined as $T_m^{(r)} = N_m^{(r)} L_c / r$.

Definition 2 (Decoder that does not use memory). The
total number of transmissions for a codeword with variable
length $L_c$ when the decoder does not use memory is defined
as

$$N_f \triangleq \inf\left\{ n : \sum_{i=(n-1)L_c+1}^{nL_c} X_i > \beta L_c \right\}.$$

The transmission delay is defined as $T_f = N_f L_c$.
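The two definitions above can be illustrated with a small Monte Carlo sketch. It is not the authors' simulation code: it assumes an i.i.d. channel ($k = 0$), assumes that $r$ divides the codeword length, and assumes that the memory decoder succeeds once more than $\beta L_c$ distinct codeword positions have been cached. The function names and the `max_attempts` cap are hypothetical.

```python
import random


def transmissions_without_memory(codeword_len, beta, gamma, rng, max_attempts=10**6):
    """Whole-codeword attempts until a single attempt delivers more than beta * L_c bits.

    Returns None if max_attempts is reached, which can happen when beta > gamma
    because a single attempt then rarely succeeds.
    """
    for attempt in range(1, max_attempts + 1):
        # Each bit is received independently with probability gamma (i.i.d. assumption).
        received = sum(1 for _ in range(codeword_len) if rng.random() < gamma)
        if received > beta * codeword_len:
            return attempt
    return None


def transmissions_with_memory(codeword_len, beta, gamma, r, rng, max_attempts=10**6):
    """Trunk transmissions (round-robin) until more than beta * L_c positions are cached."""
    trunk_len = codeword_len // r  # assumption: r divides codeword_len
    cached = [False] * codeword_len
    cached_count = 0
    for attempt in range(1, max_attempts + 1):
        start = ((attempt - 1) % r) * trunk_len  # round-robin trunk selection
        for pos in range(start, start + trunk_len):
            if not cached[pos] and rng.random() < gamma:
                cached[pos] = True
                cached_count += 1
        if cached_count > beta * codeword_len:
            return attempt
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    L_c, beta, gamma, r = 120, 0.5, 0.6, 4
    n_f = transmissions_without_memory(L_c, beta, gamma, rng)
    n_m = transmissions_with_memory(L_c, beta, gamma, r, rng)
    print("T_f =", n_f * L_c)      # delay without memory: N_f * L_c
    print("T_m =", n_m * L_c / r)  # delay with memory: N_m * L_c / r
```

Repeating these draws over random codeword lengths gives empirical delay distributions for the two decoders under the stated assumptions.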

For a codeword with variable length $L_c(b)$, the corresponding
numbers of transmissions and delays are denoted
as $N_m^{(r)}(b)$, $T_m^{(r)}(b)$, $N_f(b)$, and $T_f(b)$, respectively.
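As a small illustration of the channel capacity $\gamma$ defined earlier in this section, the sketch below computes $\gamma = \lim_n P[X_n = 1]$ for the simplest correlated case, $k = 1$, where the chain has the two states $X_n \in \{0, 1\}$. The parameter names `p_stay_good` and `p_stay_bad` are hypothetical, and the iteration count is an arbitrary choice.

```python
def stationary_success_probability(p_stay_good, p_stay_bad, iterations=10_000):
    """Approximate gamma = lim P[X_n = 1] for a two-state channel (k = 1).

    p_stay_good = P[X_n = 1 | X_{n-1} = 1], p_stay_bad = P[X_n = 0 | X_{n-1} = 0].
    The chain is assumed irreducible and aperiodic, as in the text.
    """
    prob_good = 0.5  # arbitrary initial distribution; the limit does not depend on it
    for _ in range(iterations):
        # One-step update of P[X_n = 1] under the transition probabilities.
        prob_good = prob_good * p_stay_good + (1.0 - prob_good) * (1.0 - p_stay_bad)
    return prob_good


if __name__ == "__main__":
    # Closed form for comparison: (1 - p_stay_bad) / (2 - p_stay_good - p_stay_bad).
    print(stationary_success_probability(0.9, 0.7))
```

Setting `p_stay_good = gamma` and `p_stay_bad = 1 - gamma` recovers the i.i.d. channel with capacity `gamma`.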

Notations 

In order to present the main results, we introduce some 
necessary notations here. 

Notation 1. Let $\rho(M)$ denote the Perron-Frobenius eigenvalue
(see Theorem 3.1.1 in [11]) of the matrix $M$, which
is the largest eigenvalue of $M$.

Notation 2. For $k \ge 1$, let $\{s_i\}_{1 \le i \le 2^k} = \{0, 1\}^k$ denote
the state space of $\{Y_n\}_{n \ge k+1}$, where $s_i = [s_{i1}, s_{i2}, \ldots, s_{ik}]$
and $s_{ij} \in \{0, 1\}$ for all $i, j$. Then, we define a mapping $f$ from
$\{s_i\}_{1 \le i \le 2^k}$ to $\{0, 1\}$ as

$$f(s_i) = 1 - s_{ik}.$$

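Notation 3. Let $\Lambda_n(\beta, \Pi)$ denote the large deviation rate
function, which is given by

$$\Lambda_n(\beta, \Pi) = \sup_{\theta} \left\{ \theta(1 - \beta) - \log \rho_n(\theta, \Pi) \right\},$$

where

$$\rho_n(\theta, \Pi) = \begin{cases} \rho\left(e^{\theta D^{\otimes n}} \Pi^{\otimes n}\right), & k \ge 1, \\ 1 - (1-\gamma)^n + (1-\gamma)^n e^{\theta}, & k = 0, \end{cases}$$

and $D = \mathrm{diag}[f(s_1), f(s_2), \ldots, f(s_{2^k})]$ for $k \ge 1$.

Notation 4. Let $\mu_n$ denote the root of the rate function
$\Lambda_n(\beta, \Pi)$. More precisely,

$$\Lambda_n(\mu_n, \Pi) = 0.$$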



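Notation 5.

$$\alpha = \inf\{n : \mu_n > \beta\},$$

$$\Lambda_1^o = \inf_{n \in \mathbb{N}} \frac{\lambda + \Lambda_n(\beta, \Pi)\mathbf{1}(n \ge \alpha)}{n + 1},$$

$$\Lambda_2^o = \inf_{n \in \mathbb{N}} \frac{\lambda + \Lambda_{n+1}(\beta, \Pi)\mathbf{1}(n \ge \alpha - 1)}{n + 1}.$$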


A 



v ,_, > if/?>7 



Footnote: For a matrix $A$, $A^{\otimes n}$ is the $n$-fold Kronecker product of $A$ with itself,
also called the $n$-th Kronecker power of $A$.
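For the memoryless channel ($k = 0$), the quantities in Notations 3 to 5 can be evaluated numerically. The sketch below is an illustration, not the authors' code: it uses the closed form of $\Lambda_n(\beta, \Pi)$ for $k = 0$ that appears in the proof of Lemma 1, takes $\mu_n = 1 - (1-\gamma)^n$, and truncates the infimum over $n$ at a hypothetical cutoff `n_max`. It requires $0 < \beta < 1$ and $0 < \gamma < 1$.

```python
import math


def mu(n, gamma):
    """Root of the rate function for k = 0: mean fraction of bits received after n transmissions."""
    return 1.0 - (1.0 - gamma) ** n


def rate(n, beta, gamma):
    """Closed-form Lambda_n(beta, Pi) for the i.i.d. channel (k = 0)."""
    q = (1.0 - gamma) ** n  # probability that a position is erased in all n transmissions
    if q == 0.0:
        return math.inf  # numerical underflow for very large n
    return beta * math.log(beta / (1.0 - q)) + (1.0 - beta) * math.log((1.0 - beta) / q)


def alpha(beta, gamma):
    """alpha = inf{n : mu_n > beta}."""
    n = 1
    while mu(n, gamma) <= beta:
        n += 1
    return n


def lambda_1o(lam, beta, gamma, n_max=500):
    """inf over n of (lam + Lambda_n * 1(n >= alpha)) / (n + 1), truncated at n_max."""
    a = alpha(beta, gamma)
    return min((lam + (rate(n, beta, gamma) if n >= a else 0.0)) / (n + 1)
               for n in range(1, n_max + 1))


def lambda_2o(lam, beta, gamma, n_max=500):
    """inf over n of (lam + Lambda_{n+1} * 1(n >= alpha - 1)) / (n + 1), truncated at n_max."""
    a = alpha(beta, gamma)
    return min((lam + (rate(n + 1, beta, gamma) if n >= a - 1 else 0.0)) / (n + 1)
               for n in range(1, n_max + 1))


if __name__ == "__main__":
    # Parameters of Example 2 in Section V: beta = 0.75, gamma = 0.1, lambda = 0.01.
    print(alpha(0.75, 0.1), lambda_1o(0.01, 0.75, 0.1), lambda_2o(0.01, 0.75, 0.1))
```

With the parameters of Example 2 this gives $\alpha = 14$ and $\Lambda_1^o = 0.01/14 \approx 7.1429 \times 10^{-4}$, in agreement with the values quoted there.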



III. Decoder that uses Memory 

When the decoder uses memory to cache all previously 
successfully received bits, we can apply incremental re- 
dundancy codes, where the sender splits a codeword into 
r codeword trunks and transmits one codeword trunk at 
a time. If the receiver, after receiving a codeword trunk, is 
not able to decode the corresponding information, it will 
use memory to cache the successfully received bits in the 
codeword trunk and request the sender to send another 
codeword trunk. In this way, at every transmission, the 
receiver gains extra information, which we call incre- 
mental redundancy. The sender will send these codeword 
trunks in a round-robin manner, meaning that if all of 
the codeword trunks have been requested, it will start 
over again with the first codeword trunk. It should be 
noted that incremental redundancy code is a fairly general 
framework in that if r = 1, it degenerates to a fixed rate 
erasure code, while as r approaches infinity, it resembles 
a rateless erasure code. 

A. Codeword with infinite support 

When the distribution of codeword length $L_c$ has an
exponentially decaying tail with decay rate $\lambda$, as indicated
by Equation (1), we find that the delay will always
be light-tailed, and we characterize the decay rate in
Theorem 1.

Theorem 1. In the case when the decoder uses memory,
when we apply an incremental redundancy code with parameter
$r$ to transmit a codeword with variable length $L_c$, we
obtain a lower and an upper bound on the decay rate of delay,

$$-\liminf_{t \to \infty} \frac{\log P[T_m^{(r)} > t]}{t} \le \min\{\Lambda_2^o, \Lambda_3^o\},$$

$$-\limsup_{t \to \infty} \frac{\log P[T_m^{(r)} > t]}{t} \ge \min\{\Lambda_1^o, \Lambda_3^o\}.$$

In the special case when $r = 1$,

$$-\lim_{t \to \infty} \frac{\log P[T_m^{(1)} > t]}{t} = \min\{\Lambda_1^o, \lambda\}.$$

The definitions of $\Lambda_1^o$, $\Lambda_2^o$ and $\Lambda_3^o$ can be found in Notation 5.

Proof: see Section VI-B.



Remark 1.1. From the definitions of $\Lambda_1^o$, $\Lambda_2^o$ and $\Lambda_3^o$ in
Notation 5, we observe that firstly, the decay rate of delay
when $r = 1$ is no greater than the decay rate of delay
when $r > 1$ ($\min\{\Lambda_1^o, \lambda\} \le \min\{\Lambda_1^o, \Lambda_3^o\}$), which means
that an incremental redundancy code ($r > 1$) outperforms
a fixed rate erasure code ($r = 1$); secondly, the decay rate
of delay increases with the increase of $r$, which means we
can reduce delay by increasing the number of codeword
trunks $r$. These observations are verified through Example
1 in Section V.



B. Codeword with finite support 

In practice, codeword length is bounded by the maximum
transmission unit (MTU). Therefore, we investigate
the case when the codeword has variable length $L_c(b)$,
with $b$ being the maximum codeword length, and characterize
the corresponding delay distribution in Theorem 2.

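Theorem 2. In the case when the decoder uses memory, when
we apply an incremental redundancy code with parameter $r$ to
transmit a codeword with variable length $L_c(b)$, we get
1) for any $\eta > 0$ and any $b_0 > 0$, we can find $b(\eta) > 0$ such
that for any $b > b(\eta)$, we have, for all $t \in [n_2^o(b - b_0), n_2^o b]$,

$$(1 - \eta)\Lambda_1^b \le -\frac{\log P[T_m^{(r)}(b) > t]}{t} \le (1 + \eta)\Lambda_2^b.$$

2) in the special case when $r = 1$, for any $\eta > 0$ and any
$b_0 > 0$, we can find $b(\eta) > 0$ such that for any $b > b(\eta)$, we
have, for all $t \in [n_1^o(b - b_0), n_1^o b]$,

$$(1 - \eta)\Lambda_3^b \le -\frac{\log P[T_m^{(1)}(b) > t]}{t} \le (1 + \eta)\Lambda_3^b,$$

where

$$n_1^o = \arg\inf_{n \in \mathbb{N}} \frac{\lambda + \Lambda_n(\beta, \Pi)\mathbf{1}(n \ge \alpha)}{n + 1},$$

$$n_2^o = \arg\inf_{n \in \mathbb{N}} \frac{\lambda + \Lambda_{n+1}(\beta, \Pi)\mathbf{1}(n \ge \alpha - 1)}{n + 1},$$

$$\Lambda_1^b = \Lambda_1^o + \min\{0, \Lambda_3^o - \Lambda_1^o\}\mathbf{1}(n_1^o = 1),$$

$$\Lambda_2^b = \Lambda_2^o + \min\{0, \Lambda_3^o - \Lambda_2^o\}\mathbf{1}(n_2^o = 1),$$

$$\Lambda_3^b = \Lambda_1^o + \min\{0, \lambda - \Lambda_1^o\}\mathbf{1}(n_1^o = 1).$$

Proof: see Section VI-C.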



Remark 2.1. This theorem shows that even if the codeword
length has an upper bound $b$, the distribution of
delay still has a light-tailed main body whose decay rate
is similar to the decay rate of the infinite support scenario.
The waist of this main body is $n_2^o b$ when $r > 1$ and $n_1^o b$
when $r = 1$. Since both $n_2^o$ and $n_1^o$ are independent of
$b$, we know that the waist of this light-tailed main body
scales linearly with respect to the maximum codeword
length $b$. This theorem is verified through Example 2 in
Section V.

IV. Decoder that does not use Memory 

For receivers that do not have the required computa- 
tion/storage power, it is difficult to keep track of all the 
erasure positions and the bits that have been successfully 
received. Therefore, in this section, we study the case 
when the decoder does not use memory, as illustrated in 
Fig. 1. In this situation, since the receiver simply discards
any successfully received bits if they cannot recover the 
corresponding information, it is better for the sender to 
transmit the whole codeword as a unit instead of dividing 
the codeword into pieces before transmission. 



A. Codeword with infinite support 

We observe an intriguing threshold phenomenon.
We show that when the codeword length distribution
is light-tailed and has infinite support, the
transmission delay is light-tailed (exponential) only if
$\gamma > \beta$, and heavy-tailed (power law) if $\gamma < \beta$.

Theorem 3 (Threshold phenomenon). In the case when the
decoder does not use memory and the codeword has variable
length $L_c$, we get
1) if $\beta > \gamma$, then

$$\lim_{n \to \infty} \frac{\log P[N_f > n]}{\log n} = \lim_{t \to \infty} \frac{\log P[T_f > t]}{\log t} = -\frac{\lambda}{\Lambda_1(\beta, \Pi)};$$

2) if $\beta < \gamma$, then

$$-\lim_{t \to \infty} \frac{\log P[T_f > t]}{t} = \min\{\lambda, \Lambda_1(\beta, \Pi)\}.$$

The definition of $\Lambda_1(\beta, \Pi)$ can be found in Notation 3.
Proof: see Section VI-D.



Remark 3.1. The tail distribution of the transmission
delay changes from power law to exponential, depending
on the relationship between code-rate $\beta$ and channel
capacity $\gamma$. If $\lambda/\Lambda_1(\beta, \Pi) < 1$, the system even has zero
throughput.
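The threshold in Theorem 3 can be made concrete for the i.i.d. channel ($k = 0$), where $\Lambda_1(\beta, \Pi)$ has the closed form derived in the proof of Lemma 1 with $n = 1$. The following sketch, with hypothetical function names, classifies the regime and returns the corresponding exponent; it is an illustration under the $k = 0$ assumption only.

```python
import math


def lambda_1_iid(beta, gamma):
    """Lambda_1(beta, Pi) for the i.i.d. channel: closed form with n = 1."""
    return beta * math.log(beta / gamma) + (1.0 - beta) * math.log((1.0 - beta) / (1.0 - gamma))


def tail_regime(lam, beta, gamma):
    """Return (regime, exponent) for the decoder that does not use memory.

    For beta > gamma the delay tail is a power law with exponent lam / Lambda_1;
    for beta < gamma it is exponential with decay rate min(lam, Lambda_1).
    """
    rate = lambda_1_iid(beta, gamma)
    if beta > gamma:
        return ("power law", lam / rate)
    if beta < gamma:
        return ("exponential", min(lam, rate))
    raise ValueError("beta == gamma is not covered by Theorem 3")


if __name__ == "__main__":
    regime, exponent = tail_regime(lam=0.01, beta=0.25, gamma=0.20)
    print(regime, exponent)
    # Per Remark 3.1, a power-law exponent below 1 corresponds to zero throughput.
    print("zero throughput:", regime == "power law" and exponent < 1.0)
```

For the parameters of Example 3 ($\beta = 0.25$, $\gamma = 0.20$, $\lambda = 0.01$) the exponent is about 1.35, so the tail is a power law but the exponent stays above the zero-throughput boundary.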

B. Codeword with finite support 

Under the heavy-tailed delay case when $\beta > \gamma$, we
can further show that if the codeword length is upper 
bounded, the delay distribution still has a heavy-tailed 
main body, although it eventually becomes light-tailed. 

Theorem 4. In the case when the decoder does not use memory
and the codeword has variable length $L_c(b)$, if $\beta > \gamma$, for
any $\eta > 0$, we can find $n(\eta) > 0$ and $b(\eta) > 0$ such that
1) For any $b > b(\eta)$, we have, for all $n \in [n(\eta), n_b]$,

$$1 - \eta \le -\frac{\log P[N_f(b) > n]\,\Lambda_1(\beta, \Pi)}{\lambda \log n} \le 1 + \eta.$$



2; 



lim l gP[AT / ( & )>,] =lo p 

n— »oo 71 



3) For any $b > b(\eta)$, we have, for all $t \in [n(\eta) b, n_b b]$,

$$1 - \eta \le -\frac{\log P[T_f(b) > t]\,\Lambda_1(\beta, \Pi)}{\lambda \log t} \le 1 + \eta.$$



4; 



^ logP^^t] 1 1q 



where

$$n_b = \left(P[N_f = 1 \mid L_c = b]\right)^{-1}. \qquad (2)$$

The definition of $\Lambda_1(\beta, \Pi)$ can be found in Notation 3.

Proof: see Section VI-E.

Remark 4.1. From Equation (2) and by Lemma 3, we can
obtain

$$\lim_{b \to \infty} \frac{\log n_b}{b} = \Lambda_1(\beta, \Pi), \qquad (3)$$

which implies that $n_b$ increases exponentially fast with the
increase of maximum codeword length $b$. Since the waist
of the heavy-tailed main body of the delay distribution is
$n_b b$, we know that the waist also scales exponentially fast
as we increase the maximum codeword length $b$.
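Equation (3) suggests the approximation $n_b \approx e^{b\Lambda_1(\beta, \Pi)}$, which is the form used in Example 3. The sketch below evaluates this approximation for the i.i.d. channel and, for comparison, the exact $n_b$ from Equation (2) via a binomial tail. The function names are hypothetical, and the exact value can differ from the approximation by factors that are subexponential in $b$.

```python
import math


def lambda_1_iid(beta, gamma):
    """Lambda_1(beta, Pi) for the i.i.d. channel (k = 0)."""
    return beta * math.log(beta / gamma) + (1.0 - beta) * math.log((1.0 - beta) / (1.0 - gamma))


def approx_waist(b, beta, gamma):
    """Asymptotic approximation n_b ~ exp(b * Lambda_1) implied by Equation (3)."""
    return math.exp(b * lambda_1_iid(beta, gamma))


def exact_waist_iid(b, beta, gamma):
    """n_b = 1 / P[more than beta * b of b i.i.d. bits are received], as in Equation (2)."""
    threshold = math.floor(beta * b)
    log_g, log_1g = math.log(gamma), math.log(1.0 - gamma)
    p_success = 0.0
    for j in range(threshold + 1, b + 1):
        # Binomial term computed in the log domain to avoid overflow for large b.
        log_term = (math.lgamma(b + 1) - math.lgamma(j + 1) - math.lgamma(b - j + 1)
                    + j * log_g + (b - j) * log_1g)
        p_success += math.exp(log_term)
    return 1.0 / p_success


if __name__ == "__main__":
    for b in (200, 400, 600, 800):
        print(b, approx_waist(b, 0.25, 0.20), exact_waist_iid(b, 0.25, 0.20))
```

With $\beta = 0.25$ and $\gamma = 0.20$ the approximation reproduces the waist values quoted in Example 3 for $b = 200, 400, 600, 800$.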

From Theorem 4, we know that even if the codeword
length is bounded, the heavy-tailed main body could still
play a dominant role. From Theorem 3, we know that
when $\lambda < \Lambda_1(\beta, \Pi)$ and $\beta > \gamma$, the throughput will vanish
to zero as $b$ approaches infinity. Now we explore how fast
the throughput vanishes to zero as $b$ increases.

Let $\{L_i\}_{i \ge 1}$ be the i.i.d. sequence of codeword lengths
with distribution $L_c(b)$. Denote $T_i$ as the transmission
delay of $L_i$. The throughput of this system is defined as
$\Delta(b) = \lim_{n \to \infty} \sum_{i=1}^{n} \beta L_i / \sum_{i=1}^{n} T_i$.

Theorem 5 (Throughput). In the case when the decoder does
not use memory and the codeword has variable length $L_c(b)$,
if $\beta > \gamma$ and $\lambda < \Lambda_1(\beta, \Pi)$, we have

$$-\limsup_{b \to \infty} \frac{\log \Delta(b)}{b} \ge \Lambda_1(\beta, \Pi) - \lambda.$$

The definition of $\Lambda_1(\beta, \Pi)$ can be found in Notation 3.

Proof: see Section VI-F.

Remark 5.1. Theorem 5 indicates that when code-rate $\beta$
is greater than channel capacity $\gamma$ and $\lambda < \Lambda_1(\beta, \Pi)$, as
the maximum codeword length $b$ increases, the throughput
vanishes to zero at least exponentially fast with rate
$\Lambda_1(\beta, \Pi) - \lambda$.

V. Simulations 

In this section, we conduct simulations to verify our 
main results. As is evident from the following figures, the 
simulations match theoretical results well. 

Example 1. In this example, we study the case when the
decoder uses memory and the codeword length has infinite
support. We assume that the channel is i.i.d. ($k = 0$).
As shown in Theorem 1, under the above assumptions,
the delay distribution is always light-tailed. In order to
verify this result, we assume that $L_c$ is geometrically
distributed with mean 100 ($\lambda = 0.01$), and choose code-rate
$\beta = 0.5$ and channel capacity $\gamma = 0.25$. By Theorem
1, we know that when $r = 1$, the decay rate of delay is
$\min\{\Lambda_1^o, \lambda\} = 0.0025$; when $r = 3$, the decay rate of delay
is $\min\{\Lambda_1^o, \Lambda_3^o\} = \min\{\Lambda_2^o, \Lambda_3^o\} = 0.0037$; when $r = 5$,
the decay rate of delay is $\min\{\Lambda_1^o, \Lambda_3^o\} = \min\{\Lambda_2^o, \Lambda_3^o\} =
0.0042$. From Fig. 2 we can see that the decay rate of delay
increases when $r$ increases from 1 to 5, and the theoretical
result is quite accurate.

Fig. 2. Illustration for Example 1

Fig. 3. Illustration for Example 2



Example 2. In this simulation, we study the case when
the decoder uses memory and the codeword length has
finite support. We assume that the channel is i.i.d.
($k = 0$), code-rate $\beta = 0.75$, $\lambda = 0.01$, $r = 1$, and channel
capacity $\gamma = 0.1$. From these system parameters we can
calculate $n_1^o = 14$ and $\Lambda_3^b = \min\{\Lambda_1^o, \lambda\} = 7.1429 \times 10^{-4}$.
We choose four values of the maximum codeword length $b$:
200, 400, 600, 800. Theorem 2 indicates that the delay
distribution has a light-tailed main body with decay rate
$\Lambda_3^b = 7.1429 \times 10^{-4}$ and waist $n_1^o b = 14 \times b$. In Fig. 3 we
plot the delay distributions when $b = 200, 400, 600, 800$
together with the infinite support case when $b = \infty$,
and we use a short solid line to indicate the waist of
the light-tailed main body. As we can see from Fig. 3,
the theoretical waists of the main bodies, which are
$n_1^o b = 14 \times b = 2800, 5600, 8400, 11200$, are close to the
simulation results.

Example 3. Now we use simulations to verify Theorem 4.
Theorem 4 says that when the decoder does not use memory,
if code-rate $\beta$ is greater than channel capacity $\gamma$ and
the codeword length has finite support, the distribution
of delay as well as the distribution of the number of
retransmissions have a heavy-tailed main body and an
exponential tail. The waist of the main body increases
exponentially fast with the increase of maximum codeword length
$b$. In this experiment, we set code-rate $\beta = 0.25$, channel
capacity $\gamma = 0.20$, $k = 0$, and $\lambda = 0.01$. From these
parameters we can get $\Lambda_1(\beta, \Pi) = 0.0074$. We choose four
values of the maximum codeword length $b$: 200, 400, 600, 800.
As Equation (3) indicates, the waist of the heavy-tailed
main bodies of the number of retransmissions is $n_b \approx
e^{b\Lambda_1(\beta, \Pi)} = 4.3772, 19.1595, 83.8641, 367.0865$. In Fig. 4
we plot the distribution of the number of retransmissions
when $b = 200, 400, 600, 800$ together with the infinite
support case when $b = \infty$, and we use a short solid line
to indicate the waist of the heavy-tailed main body. As
can be seen from Fig. 4, the simulation matches our
theoretical result.

Fig. 4. Illustration for Example 3

VI. Proofs 

A. Lemmas 

In order to prove the theorems, first we need the 
following three lemmas. 

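Lemma 1.

$$P[N_m^{(1)} > n \mid L_c = l_c] = e^{-l_c \Lambda_n(\beta, \Pi)\mathbf{1}(n \ge \alpha) + g_n(l_c)},$$

where

$$g_n(l_c) \in \begin{cases} o(l_c) & \text{if } n \ge \alpha, \\ o(1) & \text{otherwise.} \end{cases}$$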


Proof: First we consider the case when k > 1. By 
Definition [TJ we have 



[AT« > n \L c = l c ] 

- L a / n \ 

E 1 E x ci-Dic+*>i <^ 
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Lc, [J £j 



3=1 



(4) 



where = {X (j -_ 1)jLc+1 , . . . , I (j _i) Lc+l }, 1 < j < n. 
Let K m = y Lc +i, . . • ,F(„_i) ic+i ] and 



fn (Yin) = II / ( y 0--D^ 



3=1 



If L c > fc, then given U"=i£j> { Y in)k<i<L c forms a 
Markov chain with state space {{0, l} k } n and probability 
transition matrix II®". We further observe that if L c > k, 
we have the following relationship 



E H 1 [ x u-^+i = ) >o--P)Lc\ 

^ l E ft 1 ( X 0-i)^+i = o) >(1 - /3)£ c j 

{ £ n i ( X (i-i)ic-H = o) >(l-P)L c -k\. 
[i=i+kj=i ^ ' ) 
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Using the above observation, we can construct upper and 
lower bounds as follows. 
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-1+k 



Lc, |J f j 

3 = 1 . 



3=1 . 



3=1 



(5) 



By a direct application of Theorem 3.1.2 in H11L we know 
that for a given e > and any values of (J?=i £j> we can 



find i s such that 

E fn(Y ln ) > (l~P)L c 

_i=l+k 

> g-infi-oox-p A„(^,n)(l+e)i c 

E /n{yin)>(l-jJ)Ic-* 
_i=l+fc 

< e -infi_ w>1 _ p A n 03,n)(l-6)Jc 



(6) 



(7) 



whenever l c > l e . Since A„(cj,n) is a large deviation rate 
function, from [11 111 we know that 



inf A n (w,n) 
i-w>i-£ 



A„(/3,n) if/x»</3 
otherwise 

AnC8,II)l(n>a). 



(8) 



The upper and lower bounds ^ and (|6]), together with 
Equation Q, ([5]) and §8§, imply that 



.. . ,logP[iV m > n\L. 
lim mi 



< (l + e)A„(/3,n)l(ra> a), 

logP[iV ro >n\L c = l c ] 
— lim sup 

l c — foo *c 

> (l-e)A n (/?,II)l(n> a), 

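which, with $\epsilon \to 0$, completes the proof when $k \ge 1$. Next,
let us consider the case when $k = 0$.

In this memoryless channel case, for a single bit in
the codeword, after the $n$-th transmission, the probability that
this bit is successfully received is $1 - (1 - \gamma)^n$. Therefore
equivalently we can consider a single transmission in a
memoryless channel with erasure probability $(1 - \gamma)^n$.
Then, by a direct application of the Gartner-Ellis theorem
(Theorem 2.3.6 in [11]), we have, for any $n \ge 1$,

$$\lim_{l_c \to \infty} \frac{\log P[N_m^{(1)} > n \mid L_c = l_c]}{l_c} = -\Lambda_n(\beta, \Pi)\mathbf{1}(n \ge \alpha),$$

where, with $E_i$ denoting the indicator that bit $i$ is erased in all $n$ transmissions,

$$\Lambda_n(\beta, \Pi) = \sup_{\theta} \left\{ \theta(1 - \beta) - \log E\left[e^{\theta E_i}\right] \right\}$$

$$= \sup_{\theta} \left\{ \theta(1 - \beta) - \log\left(1 - (1 - \gamma)^n + (1 - \gamma)^n e^{\theta}\right) \right\}$$

$$= \beta \log \frac{\beta}{1 - (1 - \gamma)^n} + (1 - \beta) \log \frac{1 - \beta}{(1 - \gamma)^n},$$

and

$$\alpha = \left\lceil \frac{\log(1 - \beta)}{\log(1 - \gamma)} \right\rceil.$$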

Lemma 2. Assume 6 is a function of t, which satisfies b = 
b(t) > -. Then, /or any ijeNwe have 



logP 



lim ■ 

t— > 00 



t 

A + A X (P, IL)l(x > a) 
y + i 



Proof: 

pU>x,4 T <i c (i)< 1 

y + l y 

[t/vl 

E 

2o=r*/(»+i)i 



Ar(i) 



> X 



> x 
L c (b) = 



L c (b) = I. 
t 



y + l 



>[L c (b)=Q 



~T < L c (b) < - 

y + i y 



(9) 



From Lemma [T] we know that 



ATM 



> x 



L c (b) 



y+l 



(10) 



Since b — b(t) > |, by the definition of L c (b), we can 
easily obtain 

logP 



lim 

t— ^oo 



< io(6) < * 



y + i 



(ID 



Combining Equation (111, (101 and Q, we get 



lim sup 

t— >oo 



log I 



>x,-^<L c (b) < I 



<-(X + A x (P,n)l(x>a))/(y + l). 

The lower bound can be constructed in a similar manner. 

■ 

Lemma 3. 1) if $\beta > \gamma$, then

$$P[N_f > n \mid L_c = l_c] = \left(1 - e^{-l_c \Lambda_1(\beta, \Pi)(1 + g(l_c))}\right)^n,$$

where $g(l_c) \in o(1)$ as $l_c \to \infty$.
2) if $\beta < \gamma$, then

$$P[N_f > n \mid L_c = l_c] = e^{-n l_c \Lambda_1(\beta, \Pi)(1 + s(l_c))},$$

where $s(l_c) \in o(1)$ as $l_c \to \infty$.
Proof: From Definition [2] we know 

P[JV> > n|L c ] 

X Xi<pL c \ L,. 



=E 



=E 



n 

l<i<n ^i=(i-l)i c +l 

n I e 

l<j<n [j=(j-l)L c +l 



j'=i 



it 



Y x i< PLc 
i=(j-i)i c +i 



(12) 



where ^ = {X (j _ 1)Lc+1 , . . . , X {j _ 1)Lc+k }, 1 < j < n. 
The last equation is due to the Markov property of the 



c 



c 



channel states. Observe that if L c > k, for any 1 < j < n, 

jL c 

Y x i< p l c 

i={j-l)L e +l+k 

Y x< p l c 

i=(j-l)L c +l 

Y x i< p l c 

i=(j-l)L c + l+k 

which further yields 

jL c n 

Y X % <(3L c -k \j£i,L c 

i=(j-l)L c + l + k i=l 
jL c n 

Y X i<P L c \J^,L C 

i={j-X)L c +\ i=l 

jLc n 

Y x i<P L c \j£ t ,L c 

i=(j-l)Lc + l + k i=l 

Similarly as the proof of Lemma [T] by Theorem 3.1.2 in 
Hill , we obtain, for any 1 < j < n 



lim 



logP 



■jL 

■i=V-l)L e + 



l X i > f3L c \ 1J" =1 Si, L t 



-A 1 (/3,n)l(/3> 7 ), 



lim 



log] 



s=o-i)l c +i x i - ur=i £»i l 



c — L c 



= -A 1 (/3,n)l(/3< 7 ), 

which, by combining Equation [T2| completes the proof. 

■ 

B. Proof of Theorem 1

Proof of Theorem 1: Observe that

p |r (r) 

m 



> t 



m >h h + 1 c - h 



= E ] 

h—r 
oo (n+l)r-l 

: E E ■ 

n—1 h—nr 

+ p[t$ >t,L c >t 



T^>t,L c >t 



tr 



tr 



N % )>h >l^ <L °*h 



(13) 



Let us first focus on the first part of Equation (131. Denote 

(n+l)r-l- 



p _ vM™~ 
K ntr — / ,h — 

easy to check that 



then it is 



< 



> 



Ntf > n, < L c < - 



N^>n+1,^—<L C < - 
n + 1 n 



which, by Lemma [2j yield 



.. logP„ tr A + A„(/3,n)l(n>a) 

lira sup < , 

t^oo t n + 1 

liminf l0gF " fr > * + An+i(/3,n)l(n>a-l) _ (M) 
t->-oo rj n + 1 



For the second part of Equation (131, we have, by the 
definition of L c , 



lim 

t— >oo 

lim t _ 
lim f _ 



logP 



t 

logP[L e >t] 



logP[L 



^wmi if/3 



if p >7 



< 



A 



if > 7 



Ai if/3<7 ~ A 3- 



(15) 



fr/3/7] 



Combining Equation (13]), ( [14] ) and ( |15] ), we get 

IokP 



lim sup ■ 



T (r) . 



- max < lim sup 

^ t— VOO 



lim 

t— J-oo 



log I 



t 



Ti r) >t,L c >t 



t 



( <W^ A + A "^ n ^^,-Ag 
n n + 1 d 



min{A?,A°}. 



(16) 



The lower bound can be found in a similar manner. 
Notice that inequality (a) in the preceding equation is 
true because P ntr is nonzero only for a finite number of 
n, which is due to the fact that L c cannot be less than 1. 

In the special case when r = 1, by Lemma [2] and the 
definition of L c , we have 

.. logP„ tr A + A n (/?,n)l(n>a) 

lim = , 

t-s-oo t n + 1 



lim 

t— >oo 



log I 



(17) 



which, by combining Equation (131, completes the proof. 



C. Proof of Theorem 2

proof of Theore m |2| From the definition of ng, A° 
and A£ in Notation [5j and by Lemma [2] we can obtain, 
for any b > 0, 



logP 



lim sup — - 

t— >oc 



T (r) . 

■*- Til b i 



Then, for any $\eta > 0$ and for any $b_0 > 0$, we can find
$t(\eta)$ such that



logP 



Tin ] > t, -4tt < L c (\ + b ) < "4 



log I 



Tin ] > t, < L c ( A + bo] < -4 



< (l+r?)A2, 



> (l-r?)A?, 



whenever i > 4(77). We denote 6(77) = + fe - In other 
words, for any b > 6(77), whenever t £ [(6 — &o) n 2; ^2]» 



logP Ti r) > t, ^ < L c (b) < 



logP Ti r) > t, < L c (b) < 



<(1 + V)A°2, 



> (1 - »)A?, 



which, by using the same technique as in Equation (16),
completes the proof of the first part. The second part of
Theorem 2 follows by noting that



lim — - 

t — ► 00 



logP[T«>t, n i r <L c (^ + 6 )< r | 



— A° 



where the definition of $n^o$ can be found in Notation 5.

■ 

D. Proof of Theorem 3

Proof of Theorem 3: 1) If $\beta > \gamma$, by Lemma 3, for
any $\epsilon > 0$, we can find $l_\epsilon$ such that

$$\mathbb{P}[N_f > n \mid L_c = l_c] \ge \left(1 - e^{-l_c \Lambda_1(\beta,\Pi)(1-\epsilon)}\right)^n$$

whenever $l_c > l_\epsilon$. Then we have, for $n$ large enough,

F[N f > n] = E pP[JV> > ?i|L c ]] 



> E 

> E 

> E 



L c > l e , (l - e-MAnXi-e)^ 

logn r logn 



< L r < 



AiGS.IlXl-e) Ax^nja-e) 

^1 - e -Ai(/3,n)(i- £ )i 

log n 



-A 1 (/3,n)(l-e) 

^ _ e -(logn)^" 



<L C < 



logn 
A 1 (/3,n)(l-e) 



-(log 



Taking logarithms on both sides of the preceding
inequality, we get

$$\liminf_{n\to\infty} \frac{\log \mathbb{P}[N_f > n]}{\log n} \ge -\frac{\lambda(1+\epsilon)}{\Lambda_1(\beta,\Pi)(1-\epsilon)},$$



lim inf 

t— >oo 



logp[T«>t,^ FT <£ c (4 + 6o)<^ 



< A° 



> A°. 



logn " AidS.nJCl-e)' 

which, when $\epsilon \to 0$, results in the lower bound.

Next, we prove the upper bound. Using the same
technique as in the proof of the lower bound, and by the
definition of $L_c$, we can find $l_\epsilon$ such that



¥[N f >n]< 



Lc> i ei (i- e -^mi+e)L c y 

+ ¥[N f >n,L c < l e ] 

OO 

< Y, f 1 - e- A ^ n ^ 1+ ^y ¥[L C = I] + 0(e"* B ), 
z=z e 

< o Qf°° (1 - e - Ai ^ n )< 1+£ )*) V^ 1 -^*^ 

+ 0(e-« n ). 

Computing the integral in the preceding inequality, we
obtain

$$\limsup_{n\to\infty} \frac{\log \mathbb{P}[N_f > n]}{\log n} \le -\frac{\lambda(1-\epsilon)}{\Lambda_1(\beta,\Pi)(1+\epsilon)},$$

which, with $\epsilon \to 0$, proves the upper bound.
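
As a numerical illustration of the power-law decay just derived, the following sketch evaluates $\mathbb{P}[N_f > n] = \mathbb{E}[(1 - e^{-\Lambda_1 L_c})^n]$ exactly for a toy model and estimates the slope of $\log \mathbb{P}[N_f > n]$ against $\log n$. The geometric codeword length with $\mathbb{P}[L_c > l] = e^{-\lambda l}$, the exact per-attempt success probability $e^{-\Lambda_1 l}$ (that is, $g \equiv 0$), the function names, and the values $\lambda = 0.5$, $\Lambda_1 = 1$ are all assumptions made for illustration; none of them is taken from the paper.

```python
import math

def tail_prob_no_memory(n, lam, Lam, l_max=400):
    """Toy model: P[N_f > n] = sum_l P[L_c = l] * (1 - exp(-Lam * l))**n."""
    total = 0.0
    for l in range(1, l_max + 1):
        # Assumed codeword-length law: P[L_c = l] = e^{-lam (l-1)} - e^{-lam l}.
        p_len = math.exp(-lam * (l - 1)) - math.exp(-lam * l)
        # Assumed failure probability of one attempt: 1 - e^{-Lam l}.
        log_fail = math.log1p(-math.exp(-Lam * l))
        total += p_len * math.exp(n * log_fail)
    return total

def loglog_slope(n_lo, n_hi, lam, Lam):
    """Slope of log P[N_f > n] versus log n between n_lo and n_hi."""
    p_lo = tail_prob_no_memory(n_lo, lam, Lam)
    p_hi = tail_prob_no_memory(n_hi, lam, Lam)
    return (math.log(p_hi) - math.log(p_lo)) / (math.log(n_hi) - math.log(n_lo))

slope = loglog_slope(10**3, 10**5, lam=0.5, Lam=1.0)
print("estimated slope:", slope, "predicted -lam/Lam:", -0.5 / 1.0)
```

Under these assumptions the estimated slope should sit close to $-\lambda/\Lambda_1$, which is the exponent that both bounds converge to as $\epsilon \to 0$.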

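Now, we prove the result for $\mathbb{P}[T_f > t]$. The upper
bound follows by noting that

$$\begin{aligned}
\mathbb{P}[T_f > t] &\le \mathbb{P}[N_f L_c > t,\ L_c \le h\log t] + \mathbb{P}[L_c > h\log t] \\
&\le \mathbb{P}[N_f > t/(h\log t)] + \mathbb{P}[L_c > h\log t],
\end{aligned}$$

where $\lim_{t\to\infty} \log \mathbb{P}[N_f > t/(h\log t)]/\log t = -\lambda/\Lambda_1(\beta,\Pi)$
and $\mathbb{P}[L_c > h\log t] = o(\mathbb{P}[N_f > t/(h\log t)])$ for $h$ large
enough.

The lower bound follows by noting that, for some $l_2 > l_1 > 0$
with $\mathbb{P}[l_1 < L_c < l_2] > 0$,

$$\begin{aligned}
\mathbb{P}[T_f > t] &\ge \mathbb{P}[N_f L_c > t,\ l_1 < L_c < l_2] \\
&\ge \mathbb{P}[N_f > t/l_1]\,\mathbb{P}[l_1 < L_c < l_2].
\end{aligned}$$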

2) Observe that

$$\mathbb{P}[T_f > t] = \mathbb{P}[L_c > t] + \sum_{n=1}^{\infty} \mathbb{P}\left[N_f > n,\ \frac{t}{n+1} < L_c \le \frac{t}{n}\right]. \tag{18}$$
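
The decomposition in Equation (18) relies on two facts: the events $\{L_c > t\}$ and $\{t/(n+1) < L_c \le t/n\}$, $n \ge 1$, partition the possible codeword lengths, and on the $n$-th slab the event $N_f L_c > t$ coincides with $N_f > n$. The sketch below checks both facts by brute force. Restricting $t$ to integers, the function names, and the ranges are illustrative assumptions.

```python
def slab_index(t, l):
    """Return n >= 1 with t/(n+1) < l <= t/n, or 0 when l > t (integers only)."""
    return t // l

def check_partition(t_max=60, n_tx_max=80):
    """Verify the slab decomposition behind Equation (18) on small integers."""
    for t in range(1, t_max + 1):
        for l in range(1, t_max + 5):
            n = slab_index(t, l)
            if l > t:
                # Slab {L_c > t}: one transmission already exceeds t.
                if n != 0:
                    return False
                continue
            # Integer form of t/(n+1) < l <= t/n.
            if not (l * n <= t < l * (n + 1)):
                return False
            # On this slab, N_f * L_c > t must be equivalent to N_f > n.
            for n_tx in range(1, n_tx_max + 1):
                if (n_tx * l > t) != (n_tx > n):
                    return False
    return True

print(check_partition())
```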



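Using the same technique as in the proof of Lemma 2 and
by Lemma 3, we have, when $\beta < \gamma$,

$$\begin{aligned}
\lim_{t\to\infty} \frac{1}{t}\log \mathbb{P}\left[N_f > n,\ \frac{t}{n+1} < L_c \le \frac{t}{n}\right] &= -\frac{n\Lambda_1(\beta,\Pi) + \lambda}{n+1}, \\
\lim_{t\to\infty} \frac{\log \mathbb{P}[L_c > t]}{t} &= -\lambda,
\end{aligned}$$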



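which, combined with Equation (18) and using the same
technique as in Equation (16), yield

$$\begin{aligned}
\lim_{t\to\infty} \frac{\log \mathbb{P}[T_f > t]}{t} &= -\min\left\{ \inf_{n\in\mathbb{N}} \frac{n\Lambda_1(\beta,\Pi) + \lambda}{n+1},\ \lambda \right\} \\
&= -\min\{\Lambda_1(\beta,\Pi), \lambda\}.
\end{aligned}$$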
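
The last equality can be sanity-checked numerically. Since $(n\Lambda_1 + \lambda)/(n+1) = \Lambda_1 + (\lambda - \Lambda_1)/(n+1)$ is monotone in $n$, the infimum is approached either at $n = 1$ or as $n \to \infty$. The sketch below uses arbitrary illustrative values and truncates the infimum at a large `n_max` (an assumption), so the agreement is only approximate when $\Lambda_1 < \lambda$.

```python
def decay_rate_truncated(Lam1, lam, n_max=10**6):
    """min( inf_{1 <= n <= n_max} (n*Lam1 + lam)/(n+1), lam )."""
    # The sequence is monotone in n, so its infimum over 1..n_max
    # is attained at one of the two endpoints.
    endpoints = [(n * Lam1 + lam) / (n + 1) for n in (1, n_max)]
    return min(min(endpoints), lam)

for Lam1, lam in [(0.3, 1.0), (1.0, 0.3), (0.7, 0.7)]:
    print(Lam1, lam, decay_rate_truncated(Lam1, lam), min(Lam1, lam))
```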



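E. Proof of Theorem 4

Proof of Theorem 4: 1) By Lemma 3, we know that
when $\beta > \gamma$,

$$\mathbb{P}[N_f > n \mid L_c = l_c] = \left(1 - e^{-l_c \Lambda_1(\beta,\Pi)(1+g(l_c))}\right)^n,$$

with $g(l_c) \in o(1)$ as $l_c \to \infty$. Let us denote by $l_n$ the root
of the function $l_c(1+g(l_c)) - \frac{\log n}{\Lambda_1(\beta,\Pi)}$. In other words,
$e^{-l_n \Lambda_1(\beta,\Pi)(1+g(l_n))} = 1/n$.
For any $b_0 > 0$, we have

$$\begin{aligned}
&\mathbb{P}[N_f > n,\ l_n - z < L_c(l_n + b_0) < l_n] \\
&\quad \ge \mathbb{P}[N_f > n \mid L_c = l_n]\ \mathbb{P}[l_n - z < L_c(l_n + b_0) < l_n].
\end{aligned} \tag{19}$$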
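Note that, by Lemma 3,

$$\begin{aligned}
\lim_{n\to\infty} \frac{\log \mathbb{P}[N_f > n \mid L_c = l_n]}{\log n}
&= \lim_{n\to\infty} \frac{\log \left(1 - e^{-l_n \Lambda_1(\beta,\Pi)(1+g(l_n))}\right)^n}{\log n} \\
&= \lim_{n\to\infty} \frac{n \log\left(1 - \frac{1}{n}\right)}{\log n} = 0.
\end{aligned} \tag{20}$$

Also, by the definition of $L_c(b)$,

$$\begin{aligned}
&\lim_{n\to\infty} \frac{\log \mathbb{P}[l_n - z < L_c(l_n + b_0) < l_n]}{\log n} \\
&\quad = \lim_{n\to\infty} \frac{\log \mathbb{P}[l_n - z < L_c(l_n + b_0) < l_n]}{l_n} \cdot \frac{l_n}{\log n}
= -\frac{\lambda}{\Lambda_1(\beta,\Pi)}.
\end{aligned} \tag{21}$$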



Combining Equations (19), (20) and (21), we get

$$\lim_{n\to\infty} \frac{\log \mathbb{P}[N_f > n,\ l_n - z < L_c(l_n + b_0) < l_n]}{\log n} \ge -\frac{\lambda}{\Lambda_1(\beta,\Pi)}.$$


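Therefore, for any $\eta > 0$, we can find an $n_1(\eta)$ such that

$$\frac{\log \mathbb{P}[N_f > n,\ l_n - z < L_c(l_n + b_0) < l_n]}{\log n} \ge -\frac{\lambda}{\Lambda_1(\beta,\Pi)}(1+\eta), \tag{22}$$

whenever $n > n_1(\eta)$. Also, by Theorem 3, we can find
$n_2(\eta)$ such that

$$\frac{\log \mathbb{P}[N_f > n]}{\log n} \le -\frac{\lambda}{\Lambda_1(\beta,\Pi)}(1-\eta). \tag{23}$$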



Let $n(\eta) = \max\{n_1(\eta), n_2(\eta)\}$ and $b(\eta) = l_{n(\eta)} + b_0$. By
combining Equations (22) and (23), we know that for any
$\eta > 0$, we can find $b(\eta)$ such that for any $b > b(\eta)$,

logP[JVY(6) > n] logP[AT f > n ] 
lim sup , < lim 



logn 



logn 



< 



A 



Ai(/3,n) 



(!-»?), 



and 



lira inf 



> lim 



> - 



log P [TV/ (6) > n] 
\ogn 

log P [Nf >n,l n -z< L c (l n + b ) < In] 



log n 



A 



Ai(/3,n) 



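whenever $n \in [n(\eta), n_b]$, where $n_b$ satisfies
$b(1+g(b)) = \frac{\log n_b}{\Lambda_1(\beta,\Pi)}$. From Lemma 3,
we know that

$$n_b = e^{\Lambda_1(\beta,\Pi)\, b\,(1+g(b))} = \left(\mathbb{P}[N_f = 1 \mid L_c = b]\right)^{-1}. \tag{24}$$

2) Note that

$$\begin{aligned}
\lim_{n\to\infty} \frac{\log \mathbb{P}[N_f(b) > n]}{n}
&= \max_{l_c} \left\{ \lim_{n\to\infty} \frac{\log\left(\mathbb{P}[L_c(b) = l_c,\ N_f(b) > n]\right)}{n} \right\} \\
&= \max_{l_c} \left\{ \lim_{n\to\infty} \frac{\log\left(\mathbb{P}[L_c(b) = l_c]\right)}{n} + \log\left(\mathbb{P}[N_f(b) > 1 \mid L_c(b) = l_c]\right) \right\} \\
&= \log\left(\mathbb{P}[N_f(b) > 1 \mid L_c(b) = b]\right).
\end{aligned}$$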
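
To illustrate part 2), the following sketch computes $\log \mathbb{P}[N_f(b) > n]/n$ for a toy model and compares it with $\log \mathbb{P}[N_f(b) > 1 \mid L_c(b) = b]$. The truncated-geometric law for $L_c(b)$, the per-attempt failure probability $1 - e^{-\Lambda_1 l}$, independence across attempts, and all numerical values are assumptions chosen for illustration, not quantities from the paper.

```python
import math

def log_tail_rate(n, b, lam, Lam):
    """(1/n) * log P[N_f(b) > n] for a toy bounded-codeword model."""
    lengths = range(1, b + 1)
    # Assumed law: P[L_c(b) = l] proportional to e^{-lam l} on {1, ..., b}.
    weights = [math.exp(-lam * l) for l in lengths]
    z = sum(weights)
    # log of P[L_c(b) = l] * P[single attempt fails | l]^n, for each l.
    terms = [
        math.log(w / z) + n * math.log1p(-math.exp(-Lam * l))
        for l, w in zip(lengths, weights)
    ]
    # Log-sum-exp keeps the computation stable for large n.
    m = max(terms)
    log_p = m + math.log(sum(math.exp(x - m) for x in terms))
    return log_p / n

b, lam, Lam = 5, 0.5, 1.0
limit = math.log1p(-math.exp(-Lam * b))  # log P[N_f(b) > 1 | L_c(b) = b]
for n in (50, 500, 5000):
    print(n, log_tail_rate(n, b, lam, Lam), limit)
```

In this toy model the longest admissible codeword dominates the sum for large $n$, which is why the maximum over $l_c$ is attained at $l_c = b$.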

3,4) The proof of 3) and 4) follows by noting that
$T_f(b) = N_f(b) L_c(b)$. ■

F. Proof of Theorem 5

Proof of Theorem 5: Observe that

A ( 6)=ii m a^ - =f m 

From Theorem 4, we know that for a given $\eta > 0$, we can
find $n(\eta)$ and $b$ large enough such that

rn b b 

Epi] > / P[T/(6) > t]dt 

J n(rj)b 
J n(rj)b 

which, combined with the definition of $n_b$ in Equation (24),
yields

- limsup ^ = liminf M > A^, II) - A. 



VII. Conclusion 

In this paper, we characterize the delay distribution
in a point-to-point Markov modulated binary erasure
channel with variable codeword length. Erasure codes are
used to encode the information such that a fixed fraction
of the bits in the codeword suffices to recover the
information. We use a general coding framework called
incremental redundancy codes. In this framework, the
codeword is divided into several codeword trunks, and
these trunks are transmitted one at a time to the
receiver. The receiver therefore gains extra information,
called incremental redundancy, after each transmission.
At the receiver end, we investigate two scenarios: a
decoder that uses memory and a decoder that does not.
When the decoder uses memory, it caches all previously
successfully received bits. When the decoder does not use
memory, received bits are discarded if the corresponding
information cannot be decoded. In both cases, we first
assume that the distribution of the codeword length is
light-tailed with infinite support. Then, we consider the
more realistic case in which the codeword length is upper
bounded.
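
The two decoder scenarios can be contrasted with a small simulation. This is a minimal sketch under simplifying assumptions that differ from the model in the paper: erasures are i.i.d. rather than Markov modulated, the whole codeword is sent in every transmission ($r = 1$), and decoding succeeds once at least $\lceil \beta l_c \rceil$ bits are available. The function name and parameter values are hypothetical.

```python
import math
import random

def transmissions_until_decoded(l_c, beta, erasure_prob, use_memory, rng,
                                max_tx=100_000):
    """Count transmissions of an l_c-bit codeword until decoding succeeds."""
    need = math.ceil(beta * l_c)       # bits required for decoding (assumed rule)
    received = [False] * l_c
    for n_tx in range(1, max_tx + 1):
        if not use_memory:
            # Without memory, bits from failed attempts are discarded.
            received = [False] * l_c
        for i in range(l_c):
            # Assumed i.i.d. erasures; a bit survives with prob 1 - erasure_prob.
            if rng.random() >= erasure_prob:
                received[i] = True
        if sum(received) >= need:
            return n_tx
    return max_tx                      # safety cap, reached only in extreme cases

for seed in range(3):
    with_mem = transmissions_until_decoded(12, 0.75, 0.5, True, random.Random(seed))
    no_mem = transmissions_until_decoded(12, 0.75, 0.5, False, random.Random(seed))
    print(seed, with_mem, no_mem)
```

Both calls consume the random stream identically, so for a given seed the two decoders see the same erasure pattern in each attempt, and the decoder with memory never needs more transmissions than the one without.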

Our results show the following. First, the transmission
delay can be dramatically reduced by allowing the decoder
to use memory: the delay is always light-tailed when the
decoder uses memory, whereas it can be heavy-tailed when
the decoder does not. Second, analogously to the
non-coding case, the tail effect of the delay distribution
persists even if the codeword length has finite support.
When the codeword length is upper bounded, a light-tailed
delay distribution turns into a delay distribution with a
light-tailed main body whose decay rate is similar to that
of the infinite-support scenario, and the waist of this
main body scales linearly with the maximum codeword
length; a heavy-tailed delay distribution turns into a
delay distribution with a heavy-tailed main body, whose
waist scales exponentially with the maximum codeword
length. Our results also provide a benchmark for
quantifying the tradeoff between system complexity (which
is determined by the code rate $\beta$, the number of
codeword trunks $r$, the maximum codeword length $b$, and
whether the receiver uses memory) and the distribution of
delay.